Skip to content

reduce: unalias arrays for hcat/vcat (map)reduce #58490

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 22, 2025
Merged

Conversation

Keno
Copy link
Member

@Keno Keno commented May 21, 2025

I was using reduce(vcat, list_of_arrays) and then modifying the resulting array. This worked great, except that in the case when length(list_of_arrays) == 1, the resulting array would alias the input, resulting in the input being modified and causing havoc down the road. Of course the proper fix for this is better built-in support for immutable arrays (and possibly memory ownership) to encode those assumptions explicitly. However, while this hack is unsatisfying, I do think it is somewhat defensible. One argument hcat and vcat perform the copy, as do the various optimized special cases for these functions. Of course, this is fragile and would break as soon as someone puts it in a closure, but maybe it saves at least a little bit of headache.

I considered a more aggressive change to make the generic fallback reduce_first(op, x) = op(x), but in discussion with Jeff, this was deemed too breaking, since it would add an assumption that op has a one-arg form.

I was using `reduce(vcat, list_of_arrays)` and then modifying the resulting
array. This worked great, except that in the case when `length(list_of_arrays) == 1`,
the resulting array would alias the input, resulting in the input being modified
and causing havoc down the road. Of course the proper fix for this is better built-in
support for immutable arrays (and possibly memory ownership) to encode those assumptions
explicitly. However, while this hack is unsatisfying, I do think it is somewhat defensible.
One argument `hcat` and `vcat` perform the copy, as do the various optimized special cases
for these functions. Of course, this is fragile and would break as soon as someone puts it
in a closure, but maybe it saves at least a little bit of headache.

I considered a more aggressive change to make the generic fallback `reduce_first(op, x) = op(x)`,
but in discussion with Jeff, this was deemed too breaking, since it would add an assumption
that `op` has a one-arg form.
Keno added a commit that referenced this pull request May 21, 2025
In #58490, I said:
```
I considered a more aggressive change to make the generic fallback reduce_first(op, x) = op(x), but in discussion with Jeff, this was deemed too breaking, since it would add an assumption that op has a one-arg form.
```

However, what if we just checked whether `op` does have an appropriate
method. This would generalize the fix also. For example,
`reduce(+, list_of_arrays)` has the same aliasing issue that I
complained about in #58490. In addition, this lets us remove the
various reduce_first special cases, since the one arg reduction
op has the correct behavior for all existing cases.
@Keno Keno added the triage This should be discussed on a triage call label May 21, 2025
@Keno
Copy link
Member Author

Keno commented May 22, 2025

Or maybe we should do reduce_first(op, x::Array) = copy(x)?

@adienes adienes added the fold sum, maximum, reduce, foldl, etc. label May 22, 2025
@mbauman
Copy link
Member

mbauman commented May 22, 2025

I like this as is. It also fixes another problem: the specific example in #34380, which isn't about aliasing but rather return values (and types) themselves. It's rather interesting that this crops up so very often with the *cat family and not much elsewhere — every single link back to that issue is about *cat.

I think that's a pretty good signal this is warranted and should be a targeted fixup like this.

@StefanKarpinski
Copy link
Member

I considered a more aggressive change to make the generic fallback reduce_first(op, x) = op(x), but in discussion with Jeff, this was deemed too breaking, since it would add an assumption that op has a one-arg form.

Maybe horrible option:

reduce_first(op, x) = applicable(op, x) ? op(x) : x

@Keno
Copy link
Member Author

Keno commented May 22, 2025

reduce_first(op, x) = applicable(op, x) ? op(x) : x

I had the same thought: #58491 but turns out op may be -.

@LilithHafner
Copy link
Member

Triage thinks that we definitely need this method.

@LilithHafner LilithHafner removed the triage This should be discussed on a triage call label May 22, 2025
@mbauman mbauman merged commit d827fd7 into master May 22, 2025
10 checks passed
@mbauman mbauman deleted the kf/reducecopy branch May 22, 2025 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fold sum, maximum, reduce, foldl, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants